PAutomaC: a PFA/HMM Learning Competition
نویسندگان
چکیده
Approximating distributions over strings is a hard learning problem. Typical techniques involve using finite state machines as models and attempting to learn these; these machines can either be hand built and then have their weights estimated, or built by grammatical inference techniques: the structure and the weights are then learned simultaneously. The PAutomaC competition, run in 2012, was the first challenge to allow comparison between methods and algorithms and built a first state of the art for these techniques. Both artificial data and real data were proposed and contestants were to try to estimate the probabilities of new strings. The purpose of this paper is to describe some of the technical and intrinsic challenges such a competition has to face, to give a broad state of the art concerning both challenges dealing with learning grammars and finite state machines and the relevant literature. This paper also provides the results of the competition and a brief description and analysis of the different approaches the main participants used. S. Verwer Institute for Computing and Information Sciences, Radboud University Nijmegen. E-mail: [email protected], R. Eyraud QARMA team, Laboratoire d’Informatique Fondamentale de Marseille. E-mail: [email protected] C. de la Higuera TALN team, Laboratoire d’Informatique de Nantes Atlantique, Nantes University. E-mail: [email protected] 2 Sicco Verwer et al.
منابع مشابه
Treba: Efficient Numerically Stable EM for PFA
Training probabilistic finite automata with the EM/Baum-Welch algorithm is computationally very intensive, especially if random ergodic automata are used initially, and additional strategies such as deterministic annealing are used. In this paper we present some optimization and parallelization strategies to the Baum-Welch algorithm that often allow for training of much larger automata with a l...
متن کاملResults of the PAutomaC Probabilistic Automaton Learning Competition
Approximating distributions over strings is a hard learning problem. Typical GI techniques involve using finite state machines as models and attempting to learn both the structure and the weights, simultaneously. The PAutomaC competition is the first challenge to allow comparison between methods and algorithms and builds a first state of the art for these techniques. Both artificial data and re...
متن کاملPAC learning of Probabilistic Automaton based on the Method of Moments
Probabilitic Finite Automata (PFA) are generative graphical models that define distributions with latent variables over finite sequences of symbols, a.k.a. stochastic languages. Traditionally, unsupervised learning of PFA is performed through algorithms that iteratively improves the likelihood like the Expectation-Maximization (EM) algorithm. Recently, learning algorithms based on the so-called...
متن کاملEvaluation of the Hidden Markov Model for Detection of P300 in EEG Signals
Introduction: Evoked potentials arisen by stimulating the brain can be utilized as a communication tool between humans and machines. Most brain-computer interface (BCI) systems use the P300 component, which is an evoked potential. In this paper, we evaluate the use of the hidden Markov model (HMM) for detection of P300. Materials and Methods: The wavelet transforms, wavelet-enhanced indepen...
متن کاملMarginalizing Out Transition Probabilities for Several Subclasses of PFAs
A Bayesian manner which marginalizes transition probabilities can be generally applied to various kinds of probabilistic finite state machine models. Based on such a Bayesian manner, we implemented and compared three algorithms: variable-length gram, state merging method for PDFAs, and collapsed Gibbs sampling for PFAs. Among those, collapsed Gibbs sampling for PFAs performed the best on the da...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013